Method Mention Extraction from Scientific Research Papers
Authors
Abstract
Scientific publications contain many references to the method terminology used in scientific experiments. New terms are constantly coined within the research community, especially in the biomedical domain, where thousands of papers are published each week. In this study we report our attempt to automatically extract such method terminology from scientific research papers using rule-based and machine learning techniques. We first used linguistic features to extract fine-grained method sentences from a large biomedical corpus and then applied well-established methodologies to extract the method terms. The present study focuses on extracting method phrases that contain an explicit method keyword (algorithm, technique, analysis, approach, or method), as well as less explicit method terms such as Multiplex Ligation-dependent Probe Amplification. Our initial results show an average F-score of 91.89 for the rule-based system and 78.26 for the Conditional Random Field-based machine learning system.
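To make the idea of keyword-anchored method phrases concrete, here is a minimal Python sketch. It is not the authors' rule-based system: the stopword list, the `max_modifiers` limit, and the function names are assumptions introduced purely for illustration. It only covers the explicit-keyword case and would not find terms such as Multiplex Ligation-dependent Probe Amplification.

```python
import re

# Method keywords named in the abstract.
METHOD_KEYWORDS = {"algorithm", "technique", "analysis", "approach", "method"}

# Hypothetical stopword list (an assumption, not taken from the paper).
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "with", "and", "or", "we",
    "used", "using", "by", "to", "for", "is", "was", "this", "that",
}


def normalize(token):
    """Lowercase a token and map simple plural forms back to a keyword."""
    lowered = token.lower()
    if lowered == "analyses":
        return "analysis"
    if lowered.endswith("s") and lowered[:-1] in METHOD_KEYWORDS:
        return lowered[:-1]
    return lowered


def extract_method_phrases(sentence, max_modifiers=3):
    """Return phrases made of up to `max_modifiers` non-stopword tokens
    immediately followed by a method keyword."""
    tokens = re.findall(r"[A-Za-z][A-Za-z-]*", sentence)
    phrases = []
    for i, token in enumerate(tokens):
        if normalize(token) not in METHOD_KEYWORDS:
            continue
        # Walk left over modifier tokens until a stopword or the limit.
        start = i
        while (
            start > 0
            and i - start < max_modifiers
            and tokens[start - 1].lower() not in STOPWORDS
        ):
            start -= 1
        # Keep only keywords that have at least one modifier.
        if start < i:
            phrases.append(" ".join(tokens[start : i + 1]))
    return phrases


if __name__ == "__main__":
    text = "We applied the Viterbi algorithm and a principal component analysis to the data."
    print(extract_method_phrases(text))
```

A bare keyword with no modifier (as in "The method was fast.") is skipped on purpose, since it names no specific method. A real system would likely rely on part-of-speech tags or a chunker rather than a fixed stopword list.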
Similar resources
Empirical Evaluation of Crf-based Bibliography Extraction from Research Papers
We proposed an automatic bibliography extraction method for research papers scanned with OCR markup. The method uses conditional random fields (CRFs) to label serially OCRed text lines in the article title page as appropriate bibliographic element names. Although we achieved good extraction accuracies for some Japanese academic journals, extraction errors are inevitable. Therefore, this paper p...
An Approach to Content Extraction from Scientific Articles using Case-Based Reasoning
In this paper, we present an efficient approach for content extraction of scientific papers from web pages. The approach uses an artificial intelligence method, Case-Based Reasoning (CBR), which relies on the idea that similar problems have similar solutions and hence reuses past experiences to solve new problems or tasks. The key task of content extraction is the classification of HTML tag seque...
Citation-Enhanced Keyphrase Extraction from Research Papers: A Supervised Approach
Given the large amounts of online textual documents available these days, e.g., news articles, weblogs, and scientific papers, effective methods for extracting keyphrases, which provide a high-level topic description of a document, are greatly needed. In this paper, we propose a supervised model for keyphrase extraction from research papers, which are embedded in citation networks. To this end,...
Directly e-mailing authors of newly published papers encourages community curation
Much of the data within Model Organism Databases (MODs) comes from manual curation of the primary research literature. Given limited funding and an increasing density of published material, a significant challenge facing all MODs is how to efficiently and effectively prioritize the most relevant research papers for detailed curation. Here, we report recent improvements to the triaging process u...
Extraction of Semantic Relationships from Academic Papers using Syntactic Patterns
Integrating concept and citation networks on a specific research subject can help researchers focus their own work or use methods described in prior works. In this paper, we propose a method to extract semantic relations from concepts and citation in the descriptions of related work. Specifically, we examined (i) topic-paper relations between research topics and reference papers and (ii) method...